Search results for "speech recognition"
showing 10 items of 357 documents
Extraction of the mismatch negativity elicited by sound duration decrements: A comparison of three procedures
2009
This study focuses on the comparison of procedures for extracting brain event-related potentials (ERPs), which are brain responses to stimuli recorded using electroencephalography (EEG). These responses are used to study how the synchronization of brain electrical responses is associated with cognition, such as how the brain detects changes in the auditory world. One such event-related response to auditory change is called mismatch negativity (MMN). It is typically observed by computing a difference wave between ERPs elicited by a frequently repeated sound and ERPs elicited by an infrequently occurring sound which differs from the repeated sounds. Fast and reliable extraction of the ERPs, such as the…
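The difference-wave computation mentioned in the abstract above can be sketched roughly as follows. This is only a minimal illustration, not one of the three procedures the study compares: the function names, the toy epoch values, and the deviant-minus-standard sign convention are assumptions made for the example.

```python
from statistics import fmean


def average_erp(epochs):
    """Average equal-length epochs sample by sample to estimate an ERP."""
    if not epochs:
        raise ValueError("need at least one epoch")
    n_samples = len(epochs[0])
    if any(len(epoch) != n_samples for epoch in epochs):
        raise ValueError("all epochs must have the same length")
    # zip(*epochs) groups the i-th sample of every epoch together.
    return [fmean(samples) for samples in zip(*epochs)]


def difference_wave(standard_epochs, deviant_epochs):
    """Deviant ERP minus standard ERP (assumed sign convention)."""
    standard = average_erp(standard_epochs)
    deviant = average_erp(deviant_epochs)
    if len(standard) != len(deviant):
        raise ValueError("standard and deviant ERPs differ in length")
    return [d - s for d, s in zip(deviant, standard)]


# Hypothetical toy data: two epochs per condition, three samples each.
standard_epochs = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]
deviant_epochs = [[0.0, -1.0, 1.0], [0.0, -3.0, 1.0]]

mmn = difference_wave(standard_epochs, deviant_epochs)
print(mmn)
```

With real EEG data the epochs would first be filtered, baseline-corrected, and cleaned of artifacts; those steps are omitted here.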
Evaluation of Support in Singing
2005
This study searched for perceptual, acoustic, and physiological correlates of support in singing. Seven trained professional singers (four women and three men) sang repetitions of the syllable [pa:] at varying pitch and sound levels (1) habitually (with support) and (2) simulating singing without support. Estimate of subglottic pressure was obtained from oral pressure during [p]. Vocal fold vibration was registered with dual-channel electroglottography. Acoustic analyses were made on the recorded samples. All samples were also evaluated by the singers and other listeners, who were trained singers, singing students, and voice specialists without singing education (a total of 63 liste…
Alignment Free Dissimilarities for Nucleosome Classification
2016
Epigenetic mechanisms such as nucleosome positioning, histone modifications and DNA methylation play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have shown a role of DNA sequences in recruitment of epigenetic regulators. For this reason, the use of more suitable similarity or dissimilarity measures between DNA sequences could help in the context of epigenetic studies. In particular, alignment-free dissimilarities have already been successfully applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles…
Going in Homer: The Role of Verb-Inherent Actionality Within Self-Propelled Motion-Event Encoding
2019
The paper aims at investigating the encoding of self-propelled motion events in Homeric Greek in the light of the typology of motion events, taking into account the case of to go. The verbal class of the self-propelled motion refers to those verbs expressing the idea of a simple translational motion, such as to go, to move, without any information about the manner of motion (see, by contrast, the class of the manner-of-motion verbs, such as to run, to swim) or about the path of motion (see, by contrast, the class of the path verbs, such as to enter, to exit). According to Talmy (2000), world languages can be distinguished depending on whether they prototypically express the semantic compone…
2013
To identify factors limiting performance in multitone intensity discrimination, we presented sequences of five pure tones alternating in level between loud (85 dB SPL) and soft (30, 55, or 80 dB SPL). In the “overall-intensity task”, listeners detected a level increment on all of the five tones. In the “masking task”, the level increment was imposed only on the soft tones, rendering the soft tones targets and loud tones task-irrelevant maskers. Decision weights quantifying the importance of the five tone levels for the decision were estimated using methods of molecular psychophysics. Compatible with previous studies, listeners placed higher weights on the loud tones than on the soft tones i…
Analysis of neuronal networks in the visual system of the cat using statistical signals--simple and complex cells. Part II.
1978
When a two-dimensional noise process is additively superimposed on deterministic input signals (bars), the neurons of area 17 show a class-specific reaction in the task of signal extraction. When both parts of the signal are moved simultaneously and the signal-to-noise ratio (S/N) is varied, the simple cells achieve the same performance as was found in the psychophysical experiment. Type I complex cells extract moving deterministic signals (i.e. bars) from the stationary noise, whereas in the responses of Type II complex cells the statistical parts of the signals predominate. Considering the different cell types each as a series of a linear and a nonlinear system, one obtains the cell specific space-time freq…
Analyzing Learned Representations of a Deep ASR Performance Prediction Model
2018
This paper addresses a relatively new task: prediction of ASR performance on unseen broadcast programs. In a previous paper, we presented an ASR performance prediction system using CNNs that encode both text (ASR transcript) and speech, in order to predict word error rate. This work is dedicated to the analysis of speech signal embeddings and text embeddings learnt by the CNN while training our prediction model. We try to better understand which information is captured by the deep model and its relation with different conditioning factors. It is shown that hidden layers convey a clear signal about speech style, accent and broadcast type. We then try to leverage these 3 types of information …
Soundscape design through evolutionary engines
2008
Two implementations of an Evolutionary Sound Synthesis method using the Interaural Time Difference (ITD) and psychoacoustic descriptors are presented here as a way to develop criteria for fitness evaluation. We also explore a relationship between adaptive sound evolution and three soundscape characteristics: keysounds, key-signals and sound-marks. Sonic Localization Field is defined using a sound attenuation factor and ITD azimuth angle, respectively (Ii, Li). These pairs are used to build Spatial Sound Genotypes (SSG) and they are extracted from a waveform population set. An explanation of how our model was initially written in MATLAB is followed by a recent Pure Data (Pd) impleme…
Breaking down the word length effect on readers’ eye movements
2015
Previous research on the effect of word length on reading confounded the number of letters (NrL) in a word with its spatial width. Consequently, the extent to which visuospatial and attentional-linguistic processes contribute to the word length effect on parafoveal and foveal vision in reading and dyslexia is unknown. Scholars recently suggested that visual crowding is an important factor for determining an individual’s reading speed in fluent and dyslexic reading. We studied whether the NrL or the spatial width of target words affects fixation duration and saccadic measures in natural reading in fluent and dysfluent readers of a transparent orthography. Participants read natural sentences …
Semi-blind Independent Component Analysis of functional MRI elicited by continuous listening to music
2013
This study presents a method to analyze blood-oxygen-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) signals associated with listening to continuous music. Semi-blind independent component analysis (ICA) was applied to decompose the fMRI data to source level activation maps and their respective temporal courses. The unmixing matrix in the source separation process of ICA was constrained by a variety of acoustic features derived from the piece of music used as the stimulus in the experiment. This allowed more stable estimation and extraction of more activation maps of interest compared to conventional ICA methods.